Binary Neural Networks Algorithms, Architectures, and Applications (Baochang Zhang, Sheng Xu, Mingbao Lin etc.)

Applications in Computer Vision

FIGURE 6.8

(a) and (b) illustrate the distribution of the unbinarized weights w_i of the 6-th 1-bit layer in the 1-bit PointNet backbone when trained under XNOR-Net and our POEM, respectively. From left to right, we report the weight distribution at initialization and at the 40-th, 80-th, 120-th, 160-th, and 200-th epochs. Our POEM obtains a clearly bimodal distribution, which is much more robust.

Weight distribution: The POEM-based model relies on an Expectation-Maximization process implemented on the PyTorch [186] platform. To confirm our motivation, we compare the weight distributions obtained when training with XNOR-Net and with POEM. For a 1-bit PointNet model, we analyze the 6-th 1-bit layer, which has shape (64, 64) and therefore 4096 elements. We plot its weight distribution at the {0, 40, 80, 120, 160, 200}-th epochs. Figure 6.8 shows that the initialization (0-th epoch) is the same for XNOR-Net and POEM. However, POEM employs the Expectation-Maximization algorithm to supervise the back-propagation process, leading to an effective and robust bimodal distribution. This analysis is also consistent with the performance comparison in Table 6.5.
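As a rough illustration of the comparison described above, the following Python sketch bins the latent weights of a (64, 64) layer into a histogram and measures how many of them lie close to zero. The weights are synthetic stand-ins (a single Gaussian versus a two-component Gaussian mixture), not values from the experiments reported here. The helper names `histogram` and `near_zero_fraction`, the threshold `eps`, and the reading of "robust" as "few weights near the sign boundary" are assumptions made only for this example.

```python
import random


def histogram(values, n_bins=10, lo=-1.0, hi=1.0):
    """Count values in n_bins equal-width bins over [lo, hi].

    Values outside the range are clamped into the first or last bin.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    return counts


def near_zero_fraction(values, eps=0.1):
    """Fraction of weights with |w| < eps (assumed proxy for easy sign flips)."""
    return sum(1 for v in values if abs(v) < eps) / len(values)


rng = random.Random(0)
n = 64 * 64  # 4096 latent weights, matching the layer size in the text

# Synthetic stand-ins, NOT weights from the reported experiments.
unimodal = [rng.gauss(0.0, 0.3) for _ in range(n)]
bimodal = [rng.gauss(rng.choice((-0.5, 0.5)), 0.1) for _ in range(n)]

for name, w in (("unimodal", unimodal), ("bimodal", bimodal)):
    print(name, "near-zero fraction:", round(near_zero_fraction(w), 3))
    print(name, "histogram:", histogram(w))
```

In a real training run, the same two helpers could be applied to a flattened copy of the layer's latent weights saved at each plotted epoch. How those weights are exported depends on the training code and is not specified here.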

6.4 LWS-Det: Layer-Wise Search for 1-bit Detectors

The performance of 1-bit detectors typically degrades to the point where they are not widely deployed on real-world embedded devices. For example, BiDet [240] achieves only 13.2% mAP@[.5, .95] on the COCO minival dataset [145], an accuracy gap of 10.0% below its real-valued counterpart (on the SSD300 framework). We believe the reason is that the layer-wise binarization error significantly affects 1-bit detector learning.
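To make "layer-wise binarization error" concrete, the sketch below computes one common form of it for each layer: the mean squared difference between the latent weights w and their XNOR-Net-style approximation alpha * sign(w), with alpha = mean(|w|). This passage does not define the error, so that formulation is an assumption and is not presented as the LWS-Det objective. The layer names and the Gaussian weights are hypothetical.

```python
import random


def sign(x):
    """Sign function that maps zero to +1, as is common in binarization."""
    return 1.0 if x >= 0 else -1.0


def binarize_xnor(weights):
    """XNOR-Net-style binarization: w is approximated by alpha * sign(w)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    return alpha, [sign(w) for w in weights]


def binarization_error(weights):
    """Mean squared error between latent weights and alpha * sign(w)."""
    alpha, signs = binarize_xnor(weights)
    return sum((w - alpha * s) ** 2 for w, s in zip(weights, signs)) / len(weights)


rng = random.Random(0)

# Hypothetical layers with synthetic weights of different spread.
layers = {
    "layer_a": [rng.gauss(0.0, 0.05) for _ in range(1024)],
    "layer_b": [rng.gauss(0.0, 0.50) for _ in range(1024)],
}

# One error value per layer, so layers can be compared or ranked.
errors = {name: binarization_error(w) for name, w in layers.items()}
for name, err in errors.items():
    print(f"{name}: binarization error = {err:.5f}")
```

Weights that already sit at two symmetric values, such as [0.5, -0.5, 0.5, -0.5], give an error of exactly zero under this measure. Widely spread weights give a larger error.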

TABLE 6.3
The effects of different components of POEM on OA.

1-bit PointNet                                         OA (%)
XNOR-Net                                               81.9
Proposed baseline network                              83.1
Proposed baseline network + PReLU                      85.0
Proposed baseline network + EM                         86.2
Proposed baseline network + LSF                        86.5
Proposed baseline network + PReLU + EM + LSF (POEM)    90.2
Real-valued Counterpart                                89.2

Note: PReLU, EM, and LSF denote components introduced into our proposed baseline network, where LSF is short for learnable scale factor. The proposed baseline network + PReLU + EM + LSF is the full POEM that we propose.